Replication data for "The Promises and Pitfalls of 311 Data", Urban Affairs Review
Ariel White and Kris-Stella Trump
Email Ariel (arwhi@mit.edu) with questions.
January 2017

This folder contains the data and code needed to reproduce all analyses reported in the article and appendices.  All code has been tested in R 3.3.1.

One script pulls in raw 311 call data and a range of covariates, and cleans/prepares the data.  Then several scripts run analysis and produce plots. Individual scripts are described below.
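
As a rough sketch of the run order implied by the step descriptions below, something like the following could be used. It assumes the scripts sit in one folder and that each script resolves its file paths relative to that folder; neither has been checked against the scripts themselves. run_in_order() is a hypothetical helper and is not part of the replication code. Step 0 is left out because its script ships inside a zip file and its output dataset is already included.

```r
# Script names are taken from the step descriptions in this README.
scripts <- c(
  "NYC311_data_prep_merge_code.R",  # Step 1: builds tract- and precinct-level datasets
  "NYC311_precinct_analysis.R",     # Step 2: voter turnout analysis
  "NYC311_tract_analysis.R",        # Step 3: census return and donations analysis
  "map311callvolumes.R"             # Step 4: maps (reads a file written by Step 3)
)

# Hypothetical helper: stop early if a script is missing, otherwise source each in turn.
run_in_order <- function(scripts, dir = ".") {
  paths <- file.path(dir, scripts)
  missing <- scripts[!file.exists(paths)]
  if (length(missing) > 0) {
    stop("Scripts not found: ", paste(missing, collapse = ", "))
  }
  for (p in paths) {
    message("Running ", p)
    # chdir = TRUE temporarily switches to the script's folder (an assumption
    # about how the scripts locate their input files).
    source(p, chdir = TRUE)
  }
  invisible(paths)
}

# run_in_order(scripts)  # uncomment to run Steps 1-4 from the replication folder
```

Since the Step 1 outputs are included, Steps 2 and 3 can also be run on their own by passing only those script names.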

Please note: we have included several .zip files that contain entire folders of census data (shapefiles, etc.) that will need to be unzipped for some of the code to run properly, particularly for steps 0 and 1.
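
The unzipping can be done by hand or from R. The sketch below extracts every .zip file found in a folder into that same folder. It assumes the scripts expect the unzipped folders to sit alongside the archives, which has not been verified. unzip_all() is a hypothetical helper and is not part of the replication code.

```r
# Hypothetical helper: extract each .zip archive in `dir` into `dir` itself.
unzip_all <- function(dir = ".") {
  zips <- list.files(dir, pattern = "\\.zip$", full.names = TRUE, ignore.case = TRUE)
  for (z in zips) {
    message("Unzipping ", basename(z))
    utils::unzip(z, exdir = dir)  # assumption: contents belong next to the archive
  }
  invisible(zips)  # paths of the archives that were processed
}

# unzip_all()  # uncomment to unzip everything in the working directory
```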



***
*** Step 0. Geocode and neatly format the political donations data used in the following scripts. 
***

Both the raw data and the script for this step are provided in the zipped file "Donations.zip". Note that none of this needs to be unzipped or run for the remaining code to work, as we've also included the resulting dataset. We have included this part in case you want to see what we did or play around with it.

**Script: 

"donations_fileprep.R"

**Input files: 

Raw donations data "contributions.nimsp.2010.csv" (source: http://data.influenceexplorer.com/bulk/)
NYC tract shapefiles "tl_2010_36_tract10" (source: https://www.census.gov/geo/maps-data/data/cbf/cbf_tracts.html)

**Output data files: 

Tract-level donations data "NYCdonations2010_tractlevel.Rdata"

**Output analysis: 

NA





***
*** Step 1. Pull in raw NYC 311 calls database; use census tract and precinct shapefiles to aggregate calls to neighborhoods. 
***

This step produces several plots and tables characterizing the 311 call dataset, and also outputs the tract- and precinct-level datasets that will be pulled in and used by later scripts. 

**Script: 

"NYC311_data_prep_merge_code.R"

**Input files: 

NYC call data "311_Service_Requests_from_2010_to_Present.csv" (source: https://data.cityofnewyork.us/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9)
NYC tract shapefiles "tl_2010_36_tract10" (source: https://www.census.gov/geo/maps-data/data/cbf/cbf_tracts.html)
NYC precinct shapefiles "ny_final" (source: http://www.latfor.state.ny.us/data/)
Typology of call codes "NYCcalltypes_coded.csv" (authors’ own) 
NYC tract-level census return data "NYCtract2014censusplanningdb.Rdata" (source: http://www.census.gov/research/2012_planning_database/)
Tract-level population data "./censuspop_tracts/DEC_10_SF1_SF1DP1_with_ann.csv" (source: American FactFinder)
Precinct level vote data for NY State "NY_2010_precinctvote.csv" (source: https://dataverse.harvard.edu/dataset.xhtml?persistentId=hdl:1902.1/16320)
Precinct level registration data "vtd10vote.csv" (source: http://www.latfor.state.ny.us/data/?sec=2010vote)

**Output data files: 

Tract level all calls and census returns data "merged_NYCcensustractdata.csv"
Tract level all calls and census returns and political donations data "merged_NYCcensustractdata_poldonations.csv"
Tract level public calls and census returns data "merged_NYCcensustractdata_publiconly.csv"
Tract level public calls and census returns and political donations data "merged_NYCcensustractdata_poldonations_publiconly.csv"
Tract level street calls and census returns data "merged_NYCcensustractdata_streetonly.csv"
Tract level street calls and census returns and political donations data "merged_NYCcensustractdata_poldonations_streetonly.csv"
Precinct level all calls and voting data "precinctlevel311votingdata_mergeddec14.csv"
Precinct level public calls and voting data "precinctlevel311votingdata_mergeddec14_publiconly.csv"
Precinct level street calls and voting data "precinctlevel311votingdata_mergeddec14_streetonly.csv"

**Output analysis: 

Figure 1 ("NYC311_callvolume_2010on.pdf")
Table 2 ("top25NYCcomplaints_2010on.tex")




***
*** Step 2. Voter turnout analysis.
***

Pull in the precinct-level call data produced in Step 1, and perform the voter turnout analysis shown in the paper. This can be run without running the earlier scripts, as we've also included the datasets from Step 1.

**Script: 

"NYC311_precinct_analysis.R"

**Input files: 

Precinct level all calls and voting data "precinctlevel311votingdata_mergeddec14.csv" (source: Step 1)
Precinct level public calls and voting data "precinctlevel311votingdata_mergeddec14_publiconly.csv" (source: Step 1)
Precinct level street calls and voting data "precinctlevel311votingdata_mergeddec14_streetonly.csv" (source: Step 1)

Precinct data (http://www.latfor.state.ny.us/data/)

**Output data files:

NA

**Output analysis: 

Figure 3 ("calls_turnout_precincts_allcallsets.pdf")
Appendix Table 4: see the stargazer() calls in the script.





***
*** Step 3. Census return and political donations analysis.
***

Pull in the tract-level call data from Step 1, and perform the census return and political donations analysis shown in the paper.  This can be run without running the earlier scripts, as we've also included the datasets from Step 1.

**Script: 

"NYC311_tract_analysis.R"

**Input files:

Tract level all calls and political donations data "merged_NYCcensustractdata_poldonations.csv" (source: Step 1)
Tract level public calls and political donations data "merged_NYCcensustractdata_poldonations_publiconly.csv" (source: Step 1)
Tract level street calls and political donations data "merged_NYCcensustractdata_poldonations_streetonly.csv" (source: Step 1)

**Output data files:

Call aggregates at tract level for producing maps "tractvolumes_allcalls.csv"

**Output analysis: 

Figure 4 ("plot_donationcounts_tracts_allcalltypes.pdf")
Figure 5 ("plot_censusreturn_tracts_allcalltypes.pdf")
Figure 6 ("addingcovars_censusresponse_allcalls_3mos.pdf")
Figure 7 ("addingcovars_poldonations_allcalls_3mos.pdf")
Figure 8 ("needand311_poverty_alltimesallcalls.pdf")
Figure 9 ("needand311_medHHincome_alltimesallcalls.pdf")
Appendix Tables 2, 3, and 5: see the stargazer() calls in the script.
Appendix figure "Political Donations and 311 calls" ("addingcovars_poldonations_allcalls_3mos_with0.pdf")




***
*** Step 4. Use tract-level call volumes to produce maps shown in paper and appendix.
***

This step uses prepped data from Step 3 and precinct and tract shapefiles. 

**Script: 
"map311callvolumes.R" 

**Input files: 
Call aggregates at tract level for producing maps "tractvolumes_allcalls.csv" (source: Step 3)
NYC tract shapefiles "tl_2010_36_tract10" (source: https://www.census.gov/geo/maps-data/data/cbf/cbf_tracts.html)
Precinct shapefiles ("NYprecinctshapefiles") (source: http://www.latfor.state.ny.us/data/)
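
Because this step depends on a file written by Step 3, a quick presence check before running the mapping script may save some confusion. The sketch below is one way to do it; check_inputs() is a hypothetical helper and is not part of the replication code. It assumes the CSV sits in the working directory. The shapefiles are left out of the check because this README does not give their full file names.

```r
# Hypothetical helper: report which of the listed files exist in `dir`.
check_inputs <- function(files, dir = ".") {
  found <- file.exists(file.path(dir, files))
  names(found) <- files
  if (!all(found)) {
    warning("Missing input files: ", paste(files[!found], collapse = ", "))
  }
  found  # named logical vector, TRUE where the file was found
}

# The one CSV input listed for this step; it is written by Step 3.
step4_inputs <- c("tractvolumes_allcalls.csv")

# check_inputs(step4_inputs)  # uncomment to check the working directory
```

If the file is reported missing, the included copy may have been moved, or Step 3 can be rerun to regenerate it.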

**Output data files:
NA

**Output analysis: 
Figure 2 ("censustractcallvolumes_2010.pdf" and "censustractcallvolumes_2010percapita.pdf")
Figure A1 in Appendix ("censustractcallvolumes_2010.pdf" and "censustractHHincomes_2010.pdf")
